
1. Identity statement
Reference Type: Conference Paper (Conference Proceedings)
Site: sibgrapi.sid.inpe.br
Holder Code: ibi 8JMKD3MGPEW34M/46T9EHH
Identifier: 8JMKD3MGPEW34M/45CU66B
Repository: sid.inpe.br/sibgrapi/2021/09.06.18.15
Last Update: 2021:09.06.18.15.01 (UTC) administrator
Metadata Repository: sid.inpe.br/sibgrapi/2021/09.06.18.15.01
Metadata Last Update: 2022:09.10.00.16.17 (UTC) administrator
Citation Key: MaiaVieiPedr:2021:ViRhCo
Title: Visual rhythm-based convolutional neural networks and adaptive fusion for a multi-stream architecture applied to human action recognition
Format: On-line
Year: 2021
Access Date: 2024, Apr. 29
Number of Files: 1
Size: 939 KiB
2. Context
Author: 1 Maia, Helena de Almeida
2 Vieira, Marcelo Bernardes
3 Pedrini, Helio
Affiliation: 1 UNICAMP
2 UFJF
3 UNICAMP
Editor: Paiva, Afonso
Menotti, David
Baranoski, Gladimir V. G.
Proença, Hugo Pedro
Junior, Antonio Lopes Apolinario
Papa, João Paulo
Pagliosa, Paulo
dos Santos, Thiago Oliveira
e Sá, Asla Medeiros
da Silveira, Thiago Lopes Trugillo
Brazil, Emilio Vital
Ponti, Moacir A.
Fernandes, Leandro A. F.
Avila, Sandra
e-Mail Address: helena.maia@ic.unicamp.br
Conference Name: Conference on Graphics, Patterns and Images, 34 (SIBGRAPI)
Conference Location: Gramado, RS, Brazil (virtual)
Date: 18-22 Oct. 2021
Publisher: Sociedade Brasileira de Computação
Publisher City: Porto Alegre
Book Title: Proceedings
Tertiary Type: Master's or Doctoral Work
History (UTC): 2021-09-06 18:28:31 :: helena.maia@ic.unicamp.br -> administrator :: 2021
2022-09-10 00:16:17 :: administrator -> :: 2021
3. Content and structure
Is the master or a copy? is the master
Content Stage: completed
Transferable: 1
Keywords: action recognition
visual rhythm
multi-stream architecture
Abstract: In this work, we address the problem of human action recognition in videos. We propose and analyze a multi-stream architecture containing image-based networks pre-trained on the large ImageNet dataset. Different image representations are extracted from the videos to feed the streams, in order to provide complementary information to the system. Here, we propose new streams based on visual rhythm, which encodes longer-term information compared to still frames and optical flow. Our main contribution is a stream based on a new variant of the visual rhythm, called Learnable Visual Rhythm (LVR), formed by the outputs of a deep network. The features are collected at multiple depths to enable the analysis of different abstraction levels. This strategy significantly outperforms the handcrafted version on the UCF101 and HMDB51 datasets. We also investigate many combinations of the streams to identify the modalities that best complement each other. Experiments conducted on the two datasets show that our multi-stream network achieves competitive results compared to state-of-the-art approaches.
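Illustration (not part of the original record): the abstract describes combining several classification streams (e.g. RGB frames, optical flow, visual rhythm, LVR) through an adaptive fusion of their scores. The minimal sketch below shows one common way such a weighted late fusion can be expressed; the function names, the number of streams, and the softmax-normalized weighting scheme are assumptions for illustration only, not the authors' implementation.

    import numpy as np

    def softmax(x, axis=-1):
        # Numerically stable softmax.
        x = x - x.max(axis=axis, keepdims=True)
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    def fuse_streams(stream_scores, stream_weights):
        # Weighted late fusion of per-stream class scores.
        # stream_scores : list of (num_classes,) arrays, one per stream
        #                 (e.g. RGB, optical flow, visual rhythm, LVR).
        # stream_weights: (num_streams,) weights; an adaptive scheme would
        #                 tune these on validation data (hypothetical here).
        w = softmax(np.asarray(stream_weights, dtype=float))
        probs = np.stack([softmax(s) for s in stream_scores])
        fused = (w[:, None] * probs).sum(axis=0)
        return int(fused.argmax()), fused

    # Toy usage: 4 hypothetical streams, 101 classes (UCF101-sized output).
    rng = np.random.default_rng(0)
    scores = [rng.normal(size=101) for _ in range(4)]
    pred, fused = fuse_streams(scores, stream_weights=[1.0, 1.0, 0.5, 0.5])

In such a scheme, the fusion weights determine how much each modality contributes to the final prediction, which is the role the paper assigns to its adaptive fusion step.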
Arrangement: urlib.net > SDLA > Fonds > SIBGRAPI 2021 > Visual rhythm-based convolutional...
doc Directory Content: access
source Directory Content: there are no files
agreement Directory Content:
agreement.html 06/09/2021 15:15 1.3 KiB 
4. Conditions of access and use
data URL: http://urlib.net/ibi/8JMKD3MGPEW34M/45CU66B
zipped data URL: http://urlib.net/zip/8JMKD3MGPEW34M/45CU66B
Language: en
Target File: camera_ready.pdf
User Group: helena.maia@ic.unicamp.br
Visibility: shown
5. Allied materials
Mirror Repository: sid.inpe.br/banon/2001/03.30.15.38.24
Next Higher Units: 8JMKD3MGPEW34M/45PQ3RS
Citing Item List: sid.inpe.br/sibgrapi/2021/11.12.11.46 4
Host Collection: sid.inpe.br/banon/2001/03.30.15.38
6. Notes
Empty Fields: archivingpolicy archivist area callnumber contenttype copyholder copyright creatorhistory descriptionlevel dissemination documentstage doi edition electronicmailaddress group isbn issn label lineage mark nextedition notes numberofvolumes orcid organization pages parameterlist parentrepositories previousedition previouslowerunit progress project readergroup readpermission resumeid rightsholder schedulinginformation secondarydate secondarykey secondarymark secondarytype serieseditor session shorttitle sponsor subject tertiarymark type url versiontype volume

